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ABSTRACT 

A method is developed for fitting theoretically predicted astronomical spectra to an 
observed spectrum. Using a hierarchical Bayesian principle, the method takes both sys- 
tematic and statistical measurement errors into account, which has not been done be- 
fore in the astronomical literature. The goal is to estimate fundamental stellar parame- 
ters and their associated uncertainties. The non-availability of a convenient determinis- 
tic relation between stellar parameters and the observed spectrum, combined with the 
computational complexities this entails, necessitate the curtailment of the continuous 
Bayesian model to a reduced model based on a grid of synthetic spectra. A criterion 
for model selection based on the so-called predictive squared error loss function is pro- 
posed, together with a measure for the goodness-of-fit between observed and synthetic 
spectra. The proposed method is applied to the infrared 2.38-2.60 /mi ISO-SWS data 
(Infrared Space Observatory - Short Wavelength Spectrometer) of the star a Bootis, 
yielding estimates for the stellar parameters: effective temperature T e g = 4230 ± 83 K, 
gravity log g= 1.50 ± 0.15dex, and metallicity [Fe/H] = —0.30 ± 0.21 dex. 

Key words: Methods: data analysis - Methods: statistical - Techniques: spectro- 
scopic - Stars: fundamental parameters - Stars: individual: Alpha Boo 



1 INTRODUCTION 

There are two general approaches to the observational study 
of stellar atmospheres: analysis and synthesis. Analysis en- 
tails measuring detailed features of the spectrum under in- 
vestigation and hence deducing the parameters of the stel- 
lar atmosphere. Synthesis implies specifying atmospheric 
parameters and calculating the resulting spectrum: when 
the synthetic and observed spectra agree sufficiently closely 
and/or in an optimal way, the parameters associated with 
the synthetic spectrum are taken as estimates for the star 
under consideration. Current applications of the synthesis 
technique in the astronomical literature are, however, ham- 
pered by the lack of a suitable objective method for decid- 
ing which one out of a pool of candidate synthetic spectra 
matches the observed one best. Often, the observed spec- 
trum is simply presented along with a "best" synthetic spec- 
trum without any mention of the fit criteria employed. Of- 
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tentimes, visual comparison is used, which may be adequate 
if the spectral region used is relatively short and contains 
only a few spectral lines. Such an eye-fitting method is in 
danger of failing when the observational data cover a large 
wavelength range, in which many atomic and/or molecular 
transitions occur. Moreover, when one wants to account for 
measurement errors, the task of deciding upon the "best" 
synthetic spectrum is even more complicated. 

Inferences for parameters of a stellar atmosphere using 
the synthesis approach consist of comparing the observed 
spectrum of the star with a collection of synthetic spectra. 
Let Q — (T e ff, log g, [Fe/H]) be the most important pa- 
rameters of the stellar atmosphere: temperature in Kelvin, 
gravity expressed on the log scale, and metallicity. Let M 
refer to the number of synthetic spectra in the grid. A syn- 
thetic spectrum, f?' m ' (m = 1,...,M) is identified by its 
value for f2, f2 (m) say. 

Previously employed frequentist parameter estimation 
and model selection for the spectrum are based on a 
goodness-of-fit statistic, T(j/,f7' m '), measuring the discrep- 
ancy between observed and synthetic spectra. Kolmogorov- 
Smirnov t est statistics and residua l sum of squares are dis- 
cussed in iDecin et~al1 l|2000l . 12004 ). Both methods use the 
value of Q' m ' minimising T(y,(^ m ') as an estimate for fi. 
However, two extra complexities render a paradigm shift 
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Table 1. Symbols used in the proposed Bayesian method. 



symbol 


meaning 


V 


observed spectrum 


e 


synthetic spectrum 




"true" spectrum 




SPARE-tag 


a 2 


STDEV-tag 


n 


triplet of stellar parameters 


P(l*\V,0,o a ,ol t ) 


likelihood function 
spectrum's prior distribution 
spectrum's posterior distribution 
goodness-of-fit score for model selection 



a sensible approach, away from frequentist and towards 
Bayesian methods. Of course, this assertion does not im- 
ply the Bayesian paradigm should be deemed in any way 
superior over the likelihood and/or frequentist paradigms. 
First, the analysis presented in Sect. [7] reaches a very high 
level of agreement between observed and theoretical data 
sets. A proper inclusion of both systematic and statistical 
measurement errors in the model selection and parameter 
determination procedure is then in its place. Second, the 
computation of the theoretical data takes many CPU-hours, 
rendering the calculation of a huge grid of theoretical spectra 
unfeasible. 

Here, we present an objective tool, based on hierarchical 
Bayesian ideas, for measuring the goodness-of-fit between 
observational and synthetic spectra, at the same time incor- 
porating the statistical and systematic measurement errors. 
Precisely, the reason for choosing the Bayesian paradigm is 
the ability to combine the observed spectra with prior knowl- 
edge. Such prior knowledge, termed expert priors, originates 
from the theory of and empirical knowledge gathered about 
stellar atmospheres. The proposed method is suitable for es- 
timating stellar parameters, other than the ones presented 
here. For readers not used to Bayesian statistics, the main 
principles are outlined in Sect. [2j supplemented with key 
references. 

Sect. [3] introduces the data setting. A hierarchical 
Bayesian model is presented in Sect. [4] while the tasks of 
calculating the prior distribution and model sel ection issues 
are dis cussed in Sects[S]and|Sl respectively. As in lDecin et al.l 
|2004h . we apply our method to the case study of the 2.38- 
2.60 /im ISO-SWS spectrum of the K2IIIp star Alpha Bootis 
(Arcturus, HD 124897). Sect. [7] is devoted to the applica- 
tion. In Sect. [8] we compare the results as obtained from 
the Bayesian methodology with other studies. 



2 BAYESIAN INFERENCE 

2.1 Bayes' theorem and marginalisation 

To support understanding in this and subsequent sections, 
Table [T] presents the main symbols used. 

Similar to the frequentist inferential approach, the 
Bayesian paradigm is based on observations, y, taken with 
uncertainty and assumed to be sampled from a population 
distributed according to a probability distribution function, 



P(y\n). While within the frequentist framework a param- 
eter 7r is assumed to be an unknown constant, inference 
then being based on the sampling distribution of the data 
given the parameter, i.e., the likelihood function P(y\n), the 
Bayesian approach entertains the idea that tt is a random 
variable with a so-called prior distribution, P(tt) and with 
inference proceeding based on the conditional distribution of 
the parameter given the data P(ir\y), the so-called posterior 
distribution. The latter follows from the prior distribution 
and likelihood function combined, using Bayes' theorem (Eq. 
{TJ) and the concept of marginalisation (Eq. (O): 

P(«\y) = Piv % P{ *\ (i) 

and 

+oo 

P(y)= J P(y\7v)P(n)dn. (2) 

— oo 

The prior probability represents our state of knowledge 
about the distribution of the parameter before we analyse 
the data. This knowledge is modified by the experimental 
measurements through the likelihood function, producing 
the posterior distribution. When omitting P(y) from Eq. {I}, 
one writes P(n\y) oc P(y\n) x -P(tt). This is fine for many 
statistical inferences, such as parameter and precision es- 
timation. However, when model selection is envisaged, the 
term P(y), often termed evidence, is vitally important. 

2.2 Some examples 

2.2.1 Example 1 

Consider a single observation, y, from a normal distribution 
with mean 8 and known variance a 2 . The likelihood in this 
case is 

P(V\0) = - J= exp (± { y - 9f) oc exp - Of) . 

Assuming further that 9 is normally distributed with mean 
and variance r 2 , the prior model is 

iGelman et ail (|l995h derived the posterior distribution to be 
P(%)ocexp^(c?-r7) 2 ), (3) 

which is a normal distribution with mean r) and variance 
S 2 . We return to the parametric structure of r\ and S 2 in 
Sect. 



2.2.2 Example 2 

Consider a sequence of n Bernoulli, i.e., 0/1, trials yi, . . . , y n , 
with probability of observing 1 equal to 9, and let y = 
E" =1 yi. The resulting binomial likelihood is given by 

p(y\e) cx e v (i - e) (n ~ v) , 

and the (frequentist) maximum likelihood for the success 
probability 9 is 9ml = y/n. Suppose we specify the prior 
distribution for the success probability to be Beta: 9 ~ 
Beta(a, (5), then 

P(8) <x8 a - 1 {l-9f-\ 
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Then, the prior mean for 8 is a /(a + (3). The posterior dis- 
tribution of 9 then is: 

p(%) oc e^-^ii - 9) {n - y+l3 - 1) , 

which is, again, a beta distribution, 9\y ~ Beta(a + y, (3 + 
n — y), with posterior mean 

E (0\y) = a + V . 
y 1 1 a + /3 + n 

To illustrate this model further, assume 6 successes were 
obtained out of 10 trials and suppose that we specify a 
non-informative prior 9 ~ U(0, 1) (hence, a = j3 = 1 
since Beta(l, 1) is a uniform distribution over the interval 
[0, 1]). In this case, the maximum likelihood estimate for 9 
is 9 — 0.6 when using the classical frequentist methodology, 
while the Bayesian analysis results in a posterior mean of 
9 = 7/12 = 0.583. In other words, since the posterior mean 
is obtained by pulling the maximum likelihood value 0.6 to- 
wards the prior mean of the (7(0, 1), which equals 5. The 
larger the sample size, the less the importance of the prior 
distribution. 



2.2.3 Hierarchical models 

The above examples can be formulated as hierarchical mod- 
els in which the likelihood and the prior are specified at the 
first and the second level of the model. At the third level of 
the model we specify the probability model for the hyperpa- 
rameters r 2 and a 2 , F T and F a , which are called hyperprior 
distributions. Hence, we obtain e.g. for example 1: 

y~N(8,a 2 ), 1st level, 

~ iV(/i,T 2 ), 2nd level, 

r 2 ~ F T and a 2 ~ F CT , 3rd level. 



^-P^l^- 1 ',...,^- 1 '), 



I'd 



0) -P(/ia|/i?\. 



Repeat the second step until convergence. 



Assuming that the sampling process is converged after L 
iterations, the posterior mean of /i can be estimated by 
MCMC integration: 



L „ (£) 



Note that JJn is simply the sample mean of which is ob- 
tained after L iterations of the Gibbs sampling. In our set- 
ting fli is the posterior mean of the spectrum at wavelength 
i. 

One of the quantities of interest will be T^ m \y, //) 
(see Sect. 16. ip . In practice, if we draw L simulations from 
the posterior distribution of [i we can monitor the value of 
T^ m ' (y, //) for each iteration, £ — 1,2, ... ,L and the poste- 
rior mean of T (m) (j/,/) is simply 1/L £ f L =1 T £ (m) (y,fi). 

It is important to realise that the Bayesian method is 
typically based on fully specifying the likelihood function, 
together with a prior distribution. These, combined with the 
data, produce the posterior distribution and ultimately sta- 
tistical inferences. When analytic computations are deemed 
too cumbersome, one may then elect MCMC computations 
instead. Such a switch does not change the parametric na- 
ture of the assumptions made and hence the MCMC im- 
plementation is fully parametric. Furthermore, in many in- 
stances, like the one considered here, opting for normal dis- 
tributions greatly simplifies computations. 



2.3 Posterior inference 

As was explained in Sect. 12. ll inference is based on the pos- 
terior distribution of the unknown parameters in the model 
given the data P(n\y). This distribution can be derived an- 
alytically (as in the above example) or may have to be ap- 
proximated using the so-called Markov Chain Monte Carlo 
(MCM C) algorithm. A single iteration of the MCMC algo- 
rithm dGilks et al.lll996T ) consists of sampling the unknown 
parameters in the model from their full conditional distri- 
bution, given the current value of the other parameters in 
the model and the data. Assume that the distribution of in- 
terest is P(/x), where /i = (fii, . . . ,Hd). We denote the full 
conditional distribution of fii given all other parameters by 

P(fXi\lM-i). 

One way to implement the MCMC algorithm is through 
the well-known Gibbs sampling algorithm (Gil ks et al.l 
1 19961 ). the steps of which are as follows: 

- Step 1: 

Initialize the iteration counter of the chain (j = 
1) and the initial values for the parameters ^i' ' = 

(Mi°\ 

- Step 2: 



,/4 0) ) 



Draw a new value \i 3 = 



CO 



through successive 



sampling from the full conditional distributions: 



3 OBSERVED AND SYNTHETIC SPECTRA 

Let us discuss the observational data setting and the concept 
of synthetic spectra in turn. 



3.1 Observational data y 

The observ a tional data for a Boo, also considered in 
iDecin et al.l |2004) consist of near-infrared (2.38-2.60 /im, 
band 1A) spectra, o bserved with the SWS (Short Wave- 
length Spectrometer, Ide Graauw et al" Ill996h on board ISO 
(Infrared Space Observatory, Kessler et al.l 1996). Prior to 
the st atistical analysis, data reduction techniques are ap- 
plied (|Decin et al.l 12004 ). Bands are combinations of detec- 
tor array, aperture and grating orders such that for each 
band its detector array sees a unique order of light, and 
hence a unique wavelength A. Band 1 (2.38-4.08 (im) is sub- 
divided in 4 sub-bands: band 1A: 2.38-2.60 /im, band IB: 
2.60-3.02 ^m, band ID: 3.02-3.52 ^m, and band IE: 3.52- 
4.08 fim. The same resol ution and fa c tor sh ift for band 1A 
are used as in Table 1 of lDecin et al. (2004). Let us turn to 
the uncertainties and errors in the data. 

The error propagation of the SWS pipeline separates 
statistical errors from systematic ones. The so-called sta- 
tistical 'STDEV-tag' a contains the standard deviation of 
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the points in a given bin, and the systematic 'SPARE-tag' 
om captures the effect of the imperfect performance of the 
instrument. The 'SPARE-tag' of the ISO-SWS data corre- 
sponds to statistical accuracy, i.e., how well systematic er- 
rors can be controlled, closeness between the result of an 
experiment and the true value, while the 'STDEV-tag' cor- 
responds to the precision, i.e., how well the random errors 
can be controlled. The errors a and om have the same order 
of magnitude (Fig. [TJ . 

While the 'SPARE-tags' are almost the same for all ob- 
servations of all target stars observed by the satellite, the 
'STDEV-tag' discriminates between the quality of the data 
points. Assume a normally distributed model for the ob- 
served spectrum yi at wavelength i, i = 1,2, ...,n with 
mean E(yj) = representing the true spectrum of the 
target, possibly including systematic instrumental artifacts, 
and statistical measurement error variance Oi, then a normal 
model 

yi = fii+£i, (4) 
where Ei ~ N(0, of) is assumed. 



P(y\v,° 2 ) = Yl4>(yi\^,°i) , (5) 

where <f) is the density of the normal distribution with pa- 
rameters n and a 2 . Assume that the mean of the observa- 
tional data at wavelength i, fii, follows a normal distribution, 
i.e., 

m = 9i + Ui, (6) 
with Ui ~ N(0,a M .) and a Mi the systematic observational 
error. Following JBJ, we assume that, owing to the system- 
atic errors, the true spectrum is distributed around 8i with 
variance a Mi ■ It follows from (0 that the prior distribution 
is given by 

n 

PM8,a 2 M )=l[^\e t ,a 2 Mz ). (7) 

Then, the spectrum's posterior distribution is 
P(tJ-\y,v 2 ,<?li,Q) oc P(y\fi,a 2 ) ■ P(/j,\8,(T M ) 

n n 

j=i »=i 



3.2 Synthetic data 9 

A synthetic stellar spectrum is computed from first-principle 
physics laws governing the stellar atmosphere. For a full 
description of this stud y's synthetic spectra we refer to 
iDecin et al.l (|2000l . l2004f ) . It is very important to note that 
the functions of interest are of a continuous nature, yet they 
will be treated in a discretized way, for reasons of numerical 
feasibility. Indeed, the synthetic spectra calculations require 
a model atmosphere as input, which is obtained through 
lengthy calculations, taking several hours, in order to ob- 
tain hydrostatic equilibrium and to fulfill the conservation 
law of radiative (and convective) energy. When this would 
not have been the case, i.e., when we could have written 
^i(A) = /i(T e ft, log g, [Fe/H], A), with h representing a closed 
analytical function, then we could have estimated T e ff, log 
g and [Fe/H] directly from the observational spectrum. We 
circumvent the absence of a closed form for the spectrum 
by considering a dense grid of synthetic spectra 8, with the 
goal of providing appropriate error estimates. 

Subsequently, we rely on hierarchical Bayesian mod- 
els for spectrum fittin g, fo llowing the idea propos ed by 
lLaud fc Ibrahim! (|l995f ) and lGelfarid fc Ghoshl l| 19981 ). who 
suggested comparing the observed data (y) and hypotheti- 
cal data, termed replicated data, sampled from the posterior 
predictive distribution, by minimising a predictive discrep- 
ancy measure (Sect. [6}. 



4 HIERARCHICAL BAYESIAN MODEL FOR 
THE SPECTRUM 

Applying Bayes' theorem produces the spectrum's poste- 
rior distribution, as outlined in Sect. [2] Precisely, from our 
knowledge of 6 and y, we predict fi, i.e., we derive its pos- 
terior. For a "bad" synthetic spectrum 8, the observational 
data y and the predicted /i will differ by a relatively large 
amount. 

Using , the likelihood of the model parameters given 
the data equals 



5 POSTERIOR DISTRIBUTION FOR THE 
SPECTRUM 

5.1 The full model 

The above specifications are sufficient to define the posterior 
distribution of all model parameters jointly: 

P([i,8,a 2 ,a 2 M ,Q\y) 

111,13 9 9 

L ix LJ P(y\y,a 2 \ X P(ti\8,a 2 M \ 

likelihood, Eq. J5j prior, Eq. (7) 

x P{8\Q.) x P(fi) . (9) 

distribution of the prior mean 8 hypcrprior 

We still need to specify the hyperpriors for P(T e ff), P(log g), 
and P([Fe/H]). A literature study for th e stellar atm o spher e 
parameters of a Boo was presented in IDecin et al.l £2000), 
who found that T er r ranges from 4060 K to 4628 K, log g 
from 0.90 to 2.60 dex, and [Fe/H] from -0.77 to 0.00 dex, 
based on which we construct the hyperprior distributions. 
Further discussion on the choice of the grid parameters and 
the uncertainties thereon is relegated to Sect. [7] 

After establishing P(£l), P(8\Q) is needed to complete 
the specification of the hierarchical model. Since there is 
no deterministic relationship between 8 and Q, we cannot 
specify the mean of the prior distribution using standard 
methods, including linear, generalised linear, or non-linear 
models, for example. This implies the need to adopt a two- 
stage approach with calculation of a collection of models for 
the synthetic spectrum over a grid of discrete values in SI, 

, . . . , fi (M) , followed by usage of these models 8 {m) (m = 
1, . . . , M), as the prior mean of /j, in (0 . In this approach, the 
value of £1, given the data, is not estimated with the posterior 
means of the hyperprior distributions, but rather we select 
models from the collection calculated in the first stage. Thus, 
our two-stage approach implies a model selection procedure 
ought to be used to select the 'best' synthetic spectrum. 
This issue is discussed further in Sect. [6] 
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5.2 The reduced model 



For the mth combination of Q, we calculate (m) and con- 
sider a reduced posterior distribution 



(m) 



q(m)> 



where P(/i|i/, a 2 , af 



oc P(y\fJ,,t 

oc P{n\y,o*,<jl I ,0 K7n >), (10) 
6^ m ') is the posterior distribution of 



■P(f40 

(m) 



M 2 



the spectrum given Qf- m \ and #' m ' is the prior mean of /i 
as in J7}. Since, for the mth combination P(6» (m) \tt (m) , m) = 



1, passing from (|9]) to (|10f) is straightforward. 



(a): log(STDEV-tag A 2) in band 1A 



(b): log(SPARE-tag A 2) in band 1A 



5.2.1 Specification of the reduced model 

We focus on the posterior distribution of the spectrum \i at 
wavelength i given yi, 6$ , at, and <JM i - For the remainder 
of this section we drop superscript m and subscript i. Since 
the prior in (J7J) is conjugate to the normal likelihood in ©, 
the posterior distribution of the spectrum is normal as well. 
Formally, the likelihood and the prior can be expressed by 



P(y\/J.,a ) oc exp 
and 



P(v\9,°m) oc exp -__(ju-0) 



1 



Mi 



(11) 



(12) 



respectively. It follows from (|11|) and (|12f) that the posterior 
distribution of is 

1 



, a 2 ,a 2 M ) oc exp 



2r5 2 



(13) 



which is a normal distribution with mean #i and variance 
5 2 given by 

9i = ^ ■ 7 -i = J_ + _L (14 ) 



rJ 2 



T 2 



This model is discussed in detail by iGelman et al.l (|l995h . 
The result in (I14|l means that the posterior mean of the 
spectrum 8± in (|13p is a weighted average of the synthetic 
spectrum and the observed spectrum. It can be shown that 



(y-0) 



(15) 



u +<T A M 

where the second factor on the right hand side is a shrinkage 
factor. Hence, if the SPARE-tag auti containing the sys- 
tematic measurement error, is relatively large compared to 
the STDEV-tag, en, containing the statistical measurement 
error, the posterior mean of the spectrum at wavelength i 
shrinks towards the observed spectrum at wavelength i. In 
the reverse case, the posterior mean of the spectrum shrinks 
towards the synthetic spectrum. 



5.2.2 Contracting the variance function 

Clearly, the variance parameters cr 2 and a\j. are unknown 
and need to be estimated. Fig. [1] displays the measurement 
errors in band 1A (on the log scale). The shrinkage ratio, 
o- 2 m), is shown in panel c. Note that for wave- 
lengths smaller than or equal to 2.4 (im the mean of the 
shrinkage ratio is 0.5 while for wavelengths greater than or 
equal to 2.58 /im the mean of the shrinkage ratio increases 
to 0.87. This means that at the beginning of band 1A the 



2.40 2.45 2.50 2.55 2.60 
wavelength[micron] 

(c): shrinkage factor in band 1 A 



2.40 2.45 2.50 2.55 
wavelength[micron] 



2.40 2.45 2.50 2.55 
wavelength[micron] 



Figure 1. Measurement errors in band 1A. Panel a: statisti- 
cal log(STDEV-tag) 2 (log(cr 2 )). Panel b: systematic \og(SPARE- 
tag) 2 (log(<T 2 1 -)). Panel c: the shrinkage ratio a 2 M /{a 2 -j-a 2 ^). 



posterior mean is an average between the observed and syn- 
thetic spectrum, while the weight of the observed spectrum 
increases with the wavelength. 

To model the variance components, we cons ider an 
empirical Bayesian approach (|Carlin fc Louislll996l) . Using 
the estimates for the measurement error, we first spec- 
ify a model for a 2 and a\j. , estimate the parameters, 
and then plug in predicted values into the model. Specifi- 
cally, we smooth the data using a hie rarchical linear mixed 
model l|Verbeke fc Molenberghsll2000l ). allowing to estimate 
a smooth function for the variance components in a flexible 
fashion. For a, we assume 

logK 2 ) ~ N(X>v + Z lU ,Sl), (16) 

where Xi and Zi are known design matrices, v are regres- 
sion coefficients, and u = (ui,...,uk) are random effects 
assumed to follow itfe ~ N(0, 8 U ) (k — 1, . . . , K). A similar 
model was assumed for om- Details can be found in IShkedvl 
(2003). In this approach, we use the estimated smooth func- 
tions as variance components of the reduced model. Such a 
smooth function allows the data to dominate the posterior 
mean at the end of band 1A, where a is relatively small 
relative to om- The application to a Boo is presented in 
Sect. 
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6 MODEL SELECTION 

6.1 Measures for the goodness-of-fit 

Using (110[), we predict /i from o ur k nowledge on y and _ . 
Following iGelman et all (|l995h and lCarlm fc Louisl (|l996h . 

a weighted x 2 goodness-of-fit measure, given by 



var^l/i,,^)) • 



(17) 



can be used. T {m) (y,v) measures the discrepancy be- 
tween the observed data y and the expected mean, rela- 
tive to the variability in the model. Both am and a in- 
fluence T ( - m \y,jj,), since a 2 — var(y|/i), and because \i ~ 
N(6 ( - m \a% I ), the denominator depends on both quantities. 

In our application, we will c ompare the perfor mance 
of T {m \y,n) with the results in iDecin et"afl (|2000l ), who 
used a frequentist version of ((17} that is unable to take the 
observational errors into account. Note, however, that within 
the Bayesian framework (y, fi) is not used as a criterion 
for model selection but rather as a measure for the model 
goodness-of-fit. 

6.2 Posterior predictive distribution 

Criteria for Bayesian mode l selection are d i scusse d in 
lLaud fc Ibrahim! <|l995l ) and IGelfand fc Ghoshl l| 19981 ). all 
based on the posterior predictive distribution. 

Let t/i be the observed data at wavelength i and fj,i 
the current value of fii at the Ith MCMC iteration. Then, 
we simulate n hypothetical replications from the data given 
the current value of \x\ and denote these values by y^ ep 
(i — 1,2, ... ,n). From these n replicates P(y rep \fi, 9, y) is 
constructed. Formally, the posterior predictive distribution 
is given by 



P(y rep ,H, 6) dfj.de 



p(y rep \f,,e, y )P(ii,e\ y )d^de. (is) 



P(y rep \y) ® 



For each replicated sample, obtained from (|18ft , the observed 
data and the posterior predictive distribution are compared. 
If the mth synthetic spectrum is sufficiently accurate, the hy- 
pothetical replication and the observed data are considered 
sufficiently similar. 

6.2.1 Predictive model selection under squared error loss 

A good model for the synthetic spectrum, among the mod- 
els under consideration, should render a prediction close to 
what has been observed. Thus, a synthetic spectrum model 
leading to a small discrepancy between the replication and 
the observed data is considered a viable description of the 
data. A measure fo r the discrepancy, based on squared error 
loss is proposed by lLaud fc Ibrahim! (| 1995h : 



Ll = E[(y™ p - y) T (y^ p - y)] = E^T* ~ Vif 



(19) 



where a su perscript T refers t o tra nspose. lLaud fc Ibrahim! 
^) and lGelfand fc Ghoshl l| 19981 ) showed that L 2 m can be 



(199 



expressed as a sum of two terms: 

n 

L 2 m = J2lE(yr - ytf + var^)] = G(m) + P(m). (20) 



Here, G(m) measures the goodness-of-fit and P(m) is a 
penalty measuring model complexity. The latter is the same 
for all synthetic spectra as they are calculated with the same 
num ber of parameters . L 2 ^ ca n no w be used for mode l selec - 
tion. lLaud fc Ibrahiml (|l995l ) and IGelfand fc Ghoshl (|l998h 
suggested selecting a model from a collection of M can- 
didates by minimising the expected squared error loss of 
the replicated data. H ence, the procedure proposed by 
IGelfand fc Ghoshl (|l998h requires calculation of L m over the 
model collection: 



(m) 



Vi) 



+E 



2(m) 



(21) 



where a 2 ^ — vax(yl ep \y,m) and rj^ = E(yl ep \y,m). In 
our setting, ^ (m) = E(y^ ep \y, 6 <m) ). If we assume that both 
o"i and cm 4 are known, then the model minimising G(m) is 
selected, otherwise the model that minimises L 2 ^ is selected. 

A schematic representation of the various model build- 
ing and selection steps is presented in Figure [2] 



7 APPLICATION: THE CASE STUDY OF a 
BOO 

We apply the Bayesian method as developed above to the 
case study of the 2.38-2.60 )im ISO-SWS spectrum of the 
metal-deficient K2III peculiar giant a Boo and compare the 
newly obtained results wit h other frequentist studies, in par- 
ticular with the results of IDecin et all (|2004h ■ The same set 
of synthetic spectra has been used by these authors, i.e., a 
grid over discrete values in fl = ( T e g, logg, [Fe/H]), with 
parameter values l|Decin et al.ll2000l ): 

T cff :4160 K, 4230 K, 4300 K, 4370 K, 4440 K 
log g :1.20, 1.35, 1.50, 1.65, 1.80 
[Fe/H]:0.00, -0.15, -0.30, -0.50, -0.70. 

As in lDecin et al.l l|2004h . other parameters needed to com- 
pute a proper spherically symmetric atmosphere model and 
synthetic spectrum were kept fixed: the abundance of car- 
bon e(C) = 7.96 ± 0.20, nitrogen e(N) = 7.61 ± 0.25, and 
oxygen e(O) = 8.68 ± 0.20, and the microturbulent veloc- 
ity £t = 1.7 ± 0.5 km/s). Each synthetic spectrum is used as 
a prior mean in the hierarchical model of Sect. 15.21 There 
are 125 models in total, labelled by an (arbitrary) model 
number, as listed in Table [2] A proper angular diameter 
was calculated for each model in the grid using Eq. (1) in 
IDecin et af] (120041 ). The derived values are listed in Tabled 
Decin et all (|2000l ) derived an initial value for 
=(T eff =4320 ± 140 K, log g=1.50 ± 0.15 dex, and 
[Fe/H] = — 0.50± 0.20 dex), where the uncertainties on the 
derived parameters were guessed from (a) intrinsic uncer- 
tainties on the spectra (i.e., the ability to distinguish be- 
tween different synthetic spectra at a specific resolution), 
(b) the quality of the data, (c) the values of the non- 
local Kolmogorov-Smirnov test statistic, and (d) the dis- 
crepancies between observational and synthetic spectra. As 
such, the estimated mod el parameters and their uncertain- 
ties in lDecin et"all (|2000l ) for the ISO-SWS data are model- 
dependent external values. We merely use these results to 
define the values for our grid parameters and for their spac- 
ing. 

Let us now properly include both statistical and sys- 
tematic observational errors using the Bayesian approach. 
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Figure 2. Schematic representation of the model building and selection steps' sequencing. 



This will enable definition of a parameter range for T e ff, log 
g, and [Fe/H], and selection of the optimal model within 
the model ensemble specified. The analysis will take points 
(a)-(d) into account in a mathematically principled way, 
providing us with model-dependent (error) estimates. How 
to calculate internal model- dependent error estimates is the 
subject of Sect. 17.31 In addition, the uncertainty about the 
model itself, reflected in the so-called between-model vari- 
ability, is accounted for and combined with the internal, or 
model-dependent, variability, thus producing a measure of 
total variability. Simultaneously accounting for both sources 
properly reflects the true variability and hence produces 
standard errors wider than those obtained, for example, by 



iGriffin fc Lvnas-Gravl (|l999h . This is extremely important 
to avoid the risk of basing conclusions on noise rather than 
on signal. 

For each model, an MCMC simulation (see Sect. I2.3|l 
with 10,000 iterations, the first 5000 of which used as burn- 
in, was used to calculate the posterior mean of /j, and 
T (m) (y, 9). Indeed, when applying MCMC, one typically ac- 
counts for the fact that the sequence takes some time before 
converging to the true posterior dis tribution by discarding 
its initial portion (jCilks et al.lll996h . When in doubt as to 
how many iterates should be chopped off, it is prudent to 
choose a relatively high number. The variance functions are 
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Table 2. Steller angular diameters (expressed in milli-arcseconds) and model numbers (in between brackets) associated with the different 
model parameters of the grid of synthetic spectra. 



Teff [K] 



log g 4160 4230 4300 4370 4440 

1.20 21.16 (1) 20.95 (26) 20.72 (51) 20.51 (76) 20.27 (101) 

1.35 21.20 (6) 21.05 (31) 20.81 (56) 20.59 (81) 20.31 (106) 

1.50 21.23 (11) 21.09 (36) 20.85 (61) 20.62 (86) 20.34 (111) [Fe/H] = -0.70 

1.65 21.26 (16) 21.11 (41) 20.87 (66) 20.64 (91) 20.36 (116) 

1.80 21.28 (21) 21.06 (46) 20.98 (71) 20.60 (96) 20.38 (121) 

T20 TLW (2) 20.96 (27) 20.73 (52) 20.51 (77) 20.28 (102) 

1.35 21.20 (7) 21.03 (32) 20.80 (57) 20.57 (82) 20.32 (107) 

T50 21.23 (12) 21.06 (37) 20.82 (62) 20.60 (87) 20.34 (112) [Fe/H] = -0.50 

1.65 21.26 (17) 21.08 (42) 20.84 (67) 20.62 (92) 20.37 (117) 

1.80 21.28 (22) 21.06 (47) 20.83 (72) 20.61 (97) 20.54 (122) 

T20 21~T6 (3) 20.96 (28) 20.73 (53) 20.52 (78) 20.28 (103) 

1.35 21.20 (8) 21.01 (33) 20.78 (58) 20.56 (83) 20.32 (108) 

1.50 21.23 (13) 21.04 (38) 20.81 (63) 20.59 (88) 20.35 (113) [Fe/H] = -0.30 

1.65 21.26 (18) 21.06 (43) 20.83 (68) 20.61 (93) 20.37 (118) 

1.80 21.27 (23) 21.06 (48) 20.83 (73) 20.78 (98) 20.40 (123) 



1.20 21.16 (4) 20.96 (29) 20.74 (54) 20.52 (79) 20.29 (104) 

1.35 21.20 (9) 21.00 (34) 20.77 (59) 20.55 (84) 20.32 (109) 

T50 21.23 (14) 21.02 (39) 20.79 (64) 20.57 (89) 20.35 (114) [Fe/H] = -0.15 

TEE 21.25 (19) 21.04 (44) 20.82 (69) 20.60 (94) 20.38 (119) 

1.80 21.27 (24) 21.06 (49) 20.84 (74) 20.62 (99) 20.40 (124) 



1.20 21.16 (5) 20.97 (30) 20.74 (55) 20.52 (80) 20.28 (105) 

1.35 21.20 (10) 21.00 (35) 20.77 (60) 20.55 (85) 20.33 (110) 

1.50 21.23 (15) 21.02 (40) 20.79 (65) 20.58 (90) 20.36 (115) [Fc/H] = 0.00 

1.65 21.2 (20) 21.04 (45) 20.82 (70) 20.60 (95) 20.38 (120) 

L80 21.27 (25) 21.07 (50) 20.84 (75) 20.63 (100) 20.41 (125) 



smoothed with linear mixed models and predicted values 
used for analysis. 

To facil i tate c omparison with the frequentist results of 
iDecin et al. I (|2004l ). the ranks listed in subsequent tables and 
figures are in accordance with the rebinned band 1A data of 
a Boo, used by these authors. 

7.1 Determination of stellar parameter ranges 

Results for the best ten models, as well as for the 
models which ranked 15, 25, 50, 75, 100, and 125, are 
given in Table H Model 38 has lowest T (m) (y,/i) 
value (T cff =4230K, log g=1.50dex, [Fe/H] = -0.30 dex) 
with T (38) (y,^) = 490.1. Model 125 (T cff =4440K, log 
g = 1.80 dex, [Fe/H] = —0.00 dex) has the highest value with 
T (125) (i/,^) = 1144.0. Posterior means as calculated using 
()13[) and 95 % credible intervals, the Bayesian analog to con- 
fidence intervals, are presented in Fig. [3] Fig. [3] shows the 
density estimate for the posterior distribution of r' m ' (y, n). 
The density of T (81) (y, (j,), ranking 10th with T cff = 4370 K, 
log g = 1.35 dex, [Fe/H] = —0.70 dex, is located to the right, 
relative to the densities of the other top five models, un- 
derscoring a goodness-of-fit superior to that of model 81, 
even though the 95 % credible intervals do overlap. The 
model-dependent parameter ranges as estimated from the 
top 10 models in our Bayesian analysis range between 4160 



Table 3. Measures for the goodness-of-fit T/v for some selected 
models. The model was estimated using the predicted value of the 
linear mixed model for the variance functions. The expected loss 
values G(m) are gi ven in units of 10 6 . The ranks are chosen for 
ease of reference to \Decin et al\ \200A) . 



Expected 



Rank 


Model 


T e ff 


log g 


[Fe/H] 


T N 


loss G(m) 


1 


62 


4300 


1.50 


-0.50 


491.0 


3.403 


2 


38 


4230 


1.50 


-0.30 


490.1 


3.394 


3 


82 


4370 


1.35 


-0.50 


493.2 


3.395 


1 


61 


4300 


1.50 


-0.70 


494.3 


3.403 


5 


58 


4300 


1.35 


-0.30 


495.0 


3.395 


6 


41 


4230 


1.65 


-0.70 


495.9 


3.420 


7 


102 


4440 


1.20 


-0.50 


499.6 


3.405 


8 


14 


4160 


1.50 


-0.15 


497.4 


3.413 


9 


42 


4230 


1.65 


-0.50 


502.0 


3.422 


10 


81 


4370 


1.35 


-0.70 


503.8 


3.422 


15 


86 


4370 


1.50 


-0.70 


508.1 


3.469 


25 


15 


4160 


1.50 


0.00 


515.1 


3.487 


50 


9 


4160 


1.35 


-0.15 


566.4 


3.607 


75 


11 


4160 


1.50 


-0.70 


684.8 


3.970 


100 


117 


4440 


1.65 


-0.50 


937.4 


4.995 


125 


125 


4440 


1.80 


0.00 


1144.0 


5.728 
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Figure 3. Posterior means and 95% credible intervals for T(y, fi) 
for 12 models in band 1A. 



and 4440 K for the effective temperature, between 1.20 
and 1.65 dex for the logarithm of the gravity and between 
—0.70 and —0.15 dex for the metallicity. It will be shown in 
Sect. l7.3l that the variability reflected in such ranges can use- 
fully be combined with the internal error to produce relevant 
measures of total variability, meaning that the variability 
which would follow if the true model were known is com- 
bined with variability resulting from uncertainty about the 
model itself. Note that, by using the frequentist approach of 
iDecin et al.l i|2004h , the same set of models was selected us- 
ing the band 1A ISO-SWS data of a Boo, i.e., the inclusion 
of the systematic and statistical errors in the (Bayesian) 
analysis does not lead to different parameter ranges. This 
point is taken up in the Discussion. 

7.2 Expected squared error loss 

Model 38 (T cff = 4230K, log 5 = 1.50 dex, [Fe/H] = -0.30 
dex) has the smallest value for Z/3 8 =3.394 x 10 6 while model 
125 reaches the highest value, L\ 2h = 5.7828 x 10 6 . Figs. \5\ 
and [6] show the observed spectrum, the synthetic spectrum, 
and the posterior mean calculated from (I13|l . for models 38 
and 125. For model 38, the posterior mean and the observed 
spectrum closely agree along the entire wavelength range. 
The discrepancies are larger for model 125. Note how the 
posterior mean for the spectrum always lies between the ob- 
served and synthetic spectra. It is also clear for both models 
that the observed spectrum is more dominant at the end of 
band 1A. Especially for model 125 (Fig. [6}, the posterior 
mean and the observed spectrum become closer when ap- 
proaching the end of the band. Based on this model selection 
criterion, model 38 with stellar parameters T c fj = 4230 K, log 
g = 1.50 dex and [Fe/H] = —0.30 dex is selected as providing 




400 450 500 550 600 

T(y,mu) 



Figure 4. Kernel density estimate for the posterior distribution 



the best representation of the band 1A ISO-SWS data of a 
Boo. 



7.3 Determination of confidence intervals 

Fig.[7]shows the posterior means and the 95% credible inter- 
vals for log(cr 2 ) and log(<rf f ), as well as the shrinkage factor 
determined by a linear mixed model. The variance function 
for both a and om is substituted into the hierarchical model. 

As was explained in Sects. 13.21 and [5] we had to re- 
strict calculation of the synthetic spectra to a well-defined 
grid, w ith spacing determined by the analysis of IDecin et al.l 
(2000). However, as we only have the values for the 
predefined grid points, the accuracy of the derived param- 
eter range for T c ff, log g, and [Fe/H] is bounded by the 
grid spacing. To estimate the confidence intervals around 
the stellar parameters and to test the sensitivity of the stel- 
lar parameters to 2.38-2.60 /im IR data of a Boo, we have 
constrained the choice of the stellar model and its descrip- 
tive parameters by investigating the behaviour of interpo- 
lated ste llar models. This kind of pr ocedure was also fol- 
lowed bv lGriffin fc Lvnas-Gravl (jl999T ). who have simulated 
a non-linear analytic function to the interpolated model flux 
for the purpose of a (frequentist) least-square analysis. 

We chose not to interpolate between the synthetic spec- 
tra in the grid, but rather to calculate the stratification of a 
theoretical atmosphere model of intermediate mass, gravity 
or effective temperature by interpolating between theoret- 
ical models in the existing grid, and then to compute the 
corresponding synthetic spectrum. One may argue that for 
the type of medium-resolution spectra we are dealing with 
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Figure 5. Model 38. Ratio of the posterior mean for the syn- 
thetic spectrum of model 38 to the observed spectrum of a Boo 
(T off =4230K, log g = 1.50dex, [Fe/H] = -0.30 dex) and poste- 
rior mean for the spectrum (full line) in band 1A. 



8 - 
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Figure 6. Model 125. Ratio of the posterior mean for the syn- 
thetic spectrum of model 125 to the observed spectrum of a Boo 
(T cff =4440K, log g = 1.80 dex, [Fe/H] = 0.00 dex) and posterior 
mean for the spectrum (full line) in band 1A. 



the difference between the two approaches, i.e., interpola- 
tion between the synthetic spectra of the existing grid ver- 
sus computation of new synthetic spectra from interpolated 
theoretical model structures of intermediate Q, will be neg- 
ligible. However, our purpose is to develop a general tool 
which, for example, may also be used for observed high- 
resolution spectra. Additionally, since spectral lines behave 
very non-linearly due to saturation, blending, complex de- 
pendency on the (molecular) opacities for cool-star atmo- 
spheres,. . . interpolating between synthetic spectra should 
be avoided. For the purpose of the interpolation between the 
models, the quantities as T (temperature), log P e (electron 
pressure), log P 9 (gas pressure), log a ra( j (radiative acceler- 
ation), and logK (extinction coeffic ient) were interpolated 
linearly on log g or [Fe/H] (see e.g. iPlezll 1992ft . To interpo- 
late in T e g, the temperature distribution T ncw (r) was scaled 
as T„ew(r) = (TST7T°Jf d ) * T old (r), followed by a pressure 
integration to calculate the proper P e , P sr . . To judge upon 
the accuracy, we have interpolated between T c g = 4230 K 
and 4370 K to obtain T cff = 4300K, between log g = 1.35 
and 1.65 to obtain log g = 1.50, and between [Fe/H] = —0.30 
and —0.70 to obtain [Fe/H] = —0.50 and have compared the 
interpolated model structures (and resulting synthetic spec- 
tra) with the existing models (and spectra) from the grid. 
The largest difference occurs for the model with the interpo- 
lated metallicity ([Fe/H] = —0.50) augmenting to 5 % for P 9 
at the outermost layer of the atmosphere model. This how- 



ever only yields a discrepancy between the original theoret- 
ical spectrum and the one calculated from this interpolated 
model of maximum 0.1 % for a resolution of 1500 (while for 
a high-resolution spectrum of AA = 0.5 A, this augments to 
0.55%), proving the accuracy of our interpolation. Subse- 
quently, we performed a 1-dimensional interpolation for the 
parameter values f2 of the selected top 10 models. The pa- 
rameter spacing for the interpolated grid was AT e ff = 5K, 
Alog g = 0.01 dex, and A [Fe/H] = 0.01 dex. Synthetic spec- 
tra for these interpolated fl were then computed. 

Two comments are in place. First, the reduction of the 
number of models to the best 10 is not an intrinsic fea- 
ture of the Bayesian method. Rather, having conducted the 
aforementioned frequentist analyses, such knowledge can be 
incorporated into the Bayesian analysis by way of expert 
priors. In addition, the choice for interpolation is not intrin- 
sically linked to the Bayesian method neither, but rather 
should be viewed as one of the building blocks of our pro- 
posed method. 

Confidence intervals for each of the three parameters 
were obtained by calculating the profile posterior likelihood 
for each of the interpolated models, by holding the two pa- 
rameters fixed and using the interpolated grid over the third 
parameter. In total, 27 interpolated models for T e fi, 29 in- 
terpolated models for log g, and 39 interpolated models for 
[Fe/H] are constructed around one model. Let, for example, 
Gk, k — 1, . . . , 27, be the profile log-likelihood in tempera- 
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(a): log(STDEV-tag"2) in band 1A (b): log(SPARE-tag"2) in band 1A 



Table 4. Posterior maximum profile likelihood and interval esti- 
mates for T e ff , log g and [Fe/H] for model 62. 
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Figure 7. Variance functions. The models were fitted by applying 
a linear mixed model for the data. Panel a: log(STDEV-tag) 2 in 
band 1A with the estimated model and 95 % credible intervals. 
Panel b: log(SPARE-tag) 2 in band 1A with the estimated model 
and 95 % credible intervals. Panel c: shrinkage factor in band 1A 
with the estimated model and 95 % credible intervals. 



ture for the fcth interpolated model. Gk is conditioned upon 
the values of log g and [Fe/H]. The normalized profile log- 
likelihood is given by 
„ _ G k - min(Gfc) 

max(Gfc) — min(Gfc) ' 
The interval estimate for T c ff, log g or [Fe/H] is the set 
of all values of T e ff, log g or [Fe/H] for which the normal- 
ized profile likelihood exceeds 0.9. Table [4] and Fig. [8] ex- 
hibit a typical example for determining the confidence in- 
tervals (here, for model 62, having rank 1 when considering 
all evidence combined, both provided here and assembled 
from the literature). For all of the top 10 models, the range 
in the 90 % confidence intervals is ~ 50 K in temperature, 
~ 0.1 dex in log g and ~ 0.2 dex in [Fe/H]. These values 
thus specify the precision by which the stellar parameters 
can be determined, including all sources of variability. As 
a consequence, the best set of stellar parameters with the 
associated model-dependent internal error estimates for a 
Boo consists of T off = 4230 ± 25 K, log g = 1.50 ± 0.05 dex, 
and [Fe/H] = -0.30 ± 0.10 dex. 

Our estimates assume that the model from which they 
are calculated is the correct one. Importantly though, this 
model itself is subject to uncertainty, illustrated by the fact 
that not a single model but, say, 10 models (Table [3J are 
reasonable candidates. Constructing ranges from such a col- 
lection of models is useful in its own right, but the informa- 
tion contained therein should ideally be translated into an 
additional variance term, to be added to the internal stan- 
dard errors. This can formally be done by considering the 



ParameterMaximum (90% C.I.) 

T off 4295 (4273; 4323) 

log g 1.47 (1.415; 1.52) 

[Fc/H] -0.57 (-0.67; -0.48) 



(b) 




4240 4280 4320 4360 
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Figure 8. Profile likelihood for model 62. Panel (a): profile like- 
lihood for T ef f, panel (b): profile likelihood for log g, and panel 
(c): profile likelihood for [Fe/H]. 



total variability surrounding a parameter estimate p: 

Var(/3) = E[Var0)\M] + Vax[E0)\M], 

where A4 represents 'model'. The first term on the right 
hand side is the internal variance estimate, and is consis- 
tently estimated by the method outlined above. The sec- 
ond term stands for the variability across models. When 
choosing, for example, the best 10 models as a represen- 
tative set, one merely needs to calculate the sample vari- 
ance of the corresponding 10 estimates. For T c g, one ob- 
tains 6312.2, added to 25 2 , yielding 6937.2 and producing an 
improved standard error: T e g = 4230 ± 83 K. For the other 
two quantities, the corresponding improved error estimates 
are: log g = 1.50 ± 0.15 dex, and [Fe/H] = -0.30 ± 0.21 dex. 
These error estimates are l arger than those obtained by 
iGriffin fc Lvnas-Gravl (|l999l ), who ignored the between- 
model variability. 
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8 DISCUSSION 

8.1 Comparison with other statistical methods 

The proposed Bayesian method can compete with other 
methods used nowadays for the evaluation of stellar spec- 
tra for deducing stellar parameters. To see this, we discuss 
in this section historical work in the same field and analyse 
sources of involved errors. 

It would indeed be most convincing when we could com- 
pare our proposed Bayesian analysis with other (Bayesian) 
methods including both systematic and statistical obser- 
vational error estimates consistently throughout the whole 
analysis of evaluating observational spectra with theoreti- 
cal predicitons. However, as far as we are aware of, it is 
the first time that a statistical method including these spec- 
ifications has been developed and used. Nowadays, state- 
of-the-art Bayesian computational techniques are more and 
more leaping into the astronomical field, however with the 
main purpose to detect a line in a spectral m odel or a 
source above background l|Protassov et al.l |2002| . and ref- 
erences therein), to auto matically classify stellar spectra 
Cheeseman fc Stutzlll996h . to analyze Poisson count data 
Kraft et al.lll99ll). to analyze e vent arrival times of peri- 



odicity dGregorv fc Loredolfl9 92). to analyze helioseismol 



ogy data jMorrow fc Brownlll98£ ), to deconvo lve astrophys- 
ical images~i |Gull 19891 ). Ivan Dvk et al.l (|l999l ) were the only 
ones who have employed Bayesian techniques to analyse 
low-count, high-resolution astrophysical spectral data. They 
however have modelled the source energy spectrum as a mix- 
ture of several Gaussian line profiles and a generalized linear 
model which accounts for the continuum, i.e., one assumes 
that a transformation (e.g., log) of the model is linear in a 
set of independent variables, and they have not computed 
a full theoretical atmosphere model and corresponding syn- 
thetic spectrum. 

The frequentist approach is the method most often used 
by astronomers. The basic approach for modelling data in 
both the Bayesian and the frequentist case is the same, the 
main difference being that a frequentist route often is more 
elaborate than its Bayesian counterpart: (i) one chooses or 
designs a figure- of -merit function yielding at the end best- 
fit parameters, (ii) one assesses the appropriateness of the 
estimated parameters from a goodness-of-fit analysis, and 
(iii) one finally tries to determine the likely errors, in an 
ad hoc fashion, for the best-fitting parameters. A few com- 
ments are in place: (1) many practitioners never proceed 
beyond item (i), (2) there are numerous instances of in- 
appropriate use of frequentist methods since practitioners 
may fail to account for a method's statistical limitations, 
calling substa ntive scientific results into question (as nicely 
illustrated by iProtassov et al. I l2002t ). and (3) many statisti- 
cal methods, Bayesian and frequentist alike, are designed 
for use with closed-form expression. A few ex ample s us- 



ing th is kind of frequentist approac h include Katz et aD 
lll998h: [Griffin fc Lvnas-Gravl (Il999h: ICami et all (|200(t ): 
de Bruvne et alj <|2003h :" iDecin et al.l (|2004T ). A nice example 
in which a linear regression method has been developed for 
the analysis of astronomical data with measurement errors 
and in trinsic scatter can be found in lAkritas fc Bershadvl 
(1996). As stated before, two important conditions made 
us shift away from frequentist methods: (1) the inclusion of 
both statistical and systematic measurement uncertainties, 



and (2) the non- availability of a closed analytic formula to 
represent the stellar spectrum. 

8.2 On the application to the case-study of a Boo 

Table [5] summarizes a comprehensive literature study on 
the estimated stellar parameters of our case-study a Boo. 
A more elaborate version of this table, listing addition- 
ally other parameters such as the luminosity, the mass, 
the 12 C/ 13 C-ratio, and a short description of the methods 
and/or data us ed by the va r ious authors, can be found in 
the appendix of lDecin et alj rt2000t). The table has been up- 
dated with the results of IKrticka fc Stefll l| 19991 ) , IDecin et all 
(2004), and this study, the only ones using spectrum fitting 
to determine the stellar parameters for a Boo, during the 
past seven years. Provided that error estimates are given by 
the authors, they are listed in Table [5] It is clear that many 
authors do not provide estimates of precision. Second, those 
who do so typically do not distinguish between the sources of 
imprecision accounted for, w ith t he noteworthy exce ption of 
iGriffin fc Lvnas-Gravl (|l999h and lDecin et~al] l|2004h . These 
considerations underscore the usefulness of our method. 

Authors using spectroscopic requirements (i.e., ionisa- 
tion balance, independence of the abundance of an ion versus 
the excitation potential and equival ent width) to estimate 
the st ell ar parameters fo r a B oo are van Paradiis fc Meurs! 



Jl974h. iMackle et all (Il975l) lLambert fc Riesl dl98ll). 
iBell et alj (|l985h . lEdvardssonl j| 19881 ). and iBonnell fc Belli 



( 19931 ). From these results, we infer that the values for T e ff 
range between 4260 and 4490 K, for log g ranging between 
0.90 and 2.01 dex, and for [Fe/H] ranging from —0.56 to 
— 0.60dex. The maximum quoted uncertainties are 100 K, 
0.46 dex and 0.14 dex, respectively, although it is not always 
clear whether the authors mention an internal or external 
error estimate. As has b een pointed out by, for example, 
ISmith fc Lambert! <|l985h . one can easily assess an exter- 
nal error estimate by varying the derived parameter values. 
This normally results in AT e fi ~ 200K, A log g m 0.2 dex and 
A [Fe/H] « 0.2 dex. 

Only few authors used one or other form of spec- 
trum fitting method to est i mate s tel lar parameters, a mongs t 
them IScargle fc Streckerl (Il979l). IManduca et al.l (Il98ll). 
iPeterson etafl (ll993l)."lKrticka fc Stefll (| 19991 ) . IDecin et all 
|2003l ). and IDecin et al.l l|2004l) . In the first two of these 
manuscripts, the effective temperature was determined from 
the flux-curve shape alone, while in the others a part of the 
observational spectrum either in the visible or in the near- 
infrared was used. Values for T e ff range between 4060 and 
4390 K (with a maxim um quoted uncertainty of 435 K from 
IManduca et~ai1 (|l98ll )). for log g between 1.5 and 2.0 dex 
(with maximum uncertainty 0.2 dex) and for [Fe/H] between 
—0.27 and — 0.50 dex (with maximu m uncertainty 0.1 dex). 
According to IKrticka fc Stefll (1999), the different estimates 
for the stellar parameters as determined from different spec- 
tral regions using a minimum least-square analysis range be- 
tween 4200 and 4600 K for T cff , between 1.53 and 2.35 for log 
g and between —0.155 and —0.461 for [Fe/H]. |Peterson et al.l 
(l993i ) tabulated as results: T cff = 4300 ± 30 K, log g = 1.5 ± 
0.2 dex, and [Fe/H] = —0.5 ± 0.1 dex. We could however not 
trace back if the quoted error estimates i nclude external 
errors or only internal uncertai nties. Only IKrticka fc Stefll 
(jl999t ): lDecin et alj (|2003l . |2004| ) have applied a frequentist 
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Table 5. Literature study of a Boo: the columns tabulate the effective temperature in Kelvin, the logaritm of the gravity in cm/s 2 , and 
the metallicity, respectively. Values assumed or adopted are given in parenthesis. An error estimate is listed whenever provided by the 
authors. 



log g 



[Fe/H] 



Reference 



4350 ± 50 


1.95 ± 0.25 


—0.5 


42oU ± 5U 


u.yu ± U.OO 




A A i n _i_ on 
441U ± oU 






4240 






40o0 ± loU 






a a on I i rn 

4420 ± 150 






(4260) 


(1.6) 




4490 ± 100 


2.01 ± 0.46 


—0.56 ± 0.07 


a o rn i a o r 

4350 ± 435 






4205 ± 150 






4375 ± 50 


(1.5) 


/ n r\ 

(-0.5) 


4350 


1.8 


/ n ri\ 

(-0.51) 


a a on _L onn 

44yu ± zuu 


Z.O ± U.O 


— U.OO ± U.oU 


4370 






(4375) 


(1.57) 




(4375) 


1.6 ± 0.2 


—0.5 


(4410) 


(> 0.98) 


(—0.50) 


(4225) 


1.6 ± 0.2 


(—0.56) 


4400 


1.7 


—0.6 


(4o (0) 


1 r 1 n r 
1.0 ± U.O 




/ a Qnn\ 
(4oUU ) 


(1.74) 




4294 ± 30 






(4375) 


1.97 ± 0.20 


—0.42 


4oZl 


(1.8) 


(,-0.51) 


4340 


1.9 


—0.39 


4294 ± 30 






4300 


2.0 


—0.69 ± 0.10 


/ioon 1 nnn 

4280 ± 200 


2.19 ± 0.27 


—0.60 ± 0.14 


4362 ± 45 






4250 ± 80 


1.6 ±0.3 




(4375) 


1.5 ± 0.5 




4265 






4450 


1.96 — 1.98 


—0.5 


4350 


1.71 — 1.73 


—0.5 


4250 


1.43 — 1.44 


—0.5 


4250 


1.81 — 1.82 


0.0 


4300 ± 30 


1.5 ± 0.2 


-0.5 ± 0.1 


(4260) 


(0.9) 


(-0.77) 


(4420) 


(1.7) 


(-0.50) 


4362 


2.4 




4303 ± 47 






(4375) 


(1.5) 




4300 


1.4 


-0.47 


4291 ± 48 






4255 






4628 ± 210 






4320 






4321 ± 44 




-0.547 ±0.021 


4290 ± 30 






4291.9 ±0.7 


1.94 ±0.05 


-0.68 ±0.02 


4390 ± 90 


2.0 ±0.2 


-0.27 ±0.05 


4320 ± 140 


1.50 ±0.15 


-0.50 ±0.20 


4160 - 4300 


1.35 - 1.65 


-0.30 - 0.00 


4230 ± 83 


1.50 ±0.15 


-0.30 ±0.21 



van Paradiis 
MackleetaJ. 
Blackwell fc 



& Meurs 
. (1975) 
Shallis (1977) 



Linskv fc Avres (1978) 
Scargle fe Strecker (19791 
Blackwell et al. (1980) 
Lambert et al. (19801 
Lambert fe Ries (19811 
Manduca et al. (1981) 



Tsuii (1981) 
Frisk et al. (19821 
Kiagrgaard et al. (1982) 
Burnashev (19831 
Burnashev (19831 
Harris fc Lambert (19841 



Bell et al. (1985) 
Gratton (19851 
Judge (19861 
Kvrolaincn et al. (1986) 
Tsuii (19861 
Altas (19871 

di Benedetto fc Rabbia (1987 1 

Edvardsson (19881 

Bell fc Gustafsson (19891 




Fernandez- Villacahas et al. (1990) 
McWilliam (1990) 
Blackwell et al. (19911 
Judge fc Stencel (1991 1 



Tsuii (1991) 
Engelke (19921 
Bonnell fc Bell (19931 
Bonnell fc Bell (19931 
Bonnell fc Bell (19931 
Bonnell fc Bell (19931 
(1993) 



Petersonetal. 
Gadun (19941 
Gadun (19941 
Cohen et al. (19961 
Quirrcnbacli i et_al i (1996) 
Aoki fc Tsuii (19971 
Pilachowski et al. (19971 
di Benedetto (19981 
di Benedetto (19981 
Dvck et al. (19981 
Hammcrslev_et_aL (1998) 
Perrin et al. (19981 
Taylor (19991 
Griffin & Lynas-Grav 
Griffin & Lynas-Gray 
Krticka fc Stefl (19991 
Decin et al. (20001 
Decin et al. (20041 
This paper 



*: model-independent external errors; **: model-dependent internal errors 
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least-square method to optimize the stellar parameters for 
a Boo using spectrum fitting. None of them included sys- 
tematic and statistical error estimates. 

Including both error sources, a and um does not re- 
sult in different ranges for the fundamental ste llar parame- 
ters T e ff, log g and [Fe/H] of a Boo, relative to lDecin et al.l 
|2004h . even though the latter authors did not take measure- 
ment errors into account. Possibly, the error measurements 
on the different data points are smaller than the difference 
between the observational data and even the best model, 
which then would not result in gain of evidence when includ- 
ing the errors. Comparing the ratio of the observational data 
to the synthetic data of model 62 (having rank 1) with a and 
om , we note that all of them have the same order of magni- 
tude. This also indicates that the remaining stru cture when 
considering y(t)/9^ 62) (t), as in lDecin et alj l|2004 ). is not due 
to measurement uncertainties but rather indicates that some 
pattern in the observational data is not captured by the the- 
oretical predictions. Plausible explanations for this are: (1) 
the fact we kept the C (carbon), N (nitrogen), and O (oxy- 
gen) abundance and the microturbulence fixed, (2) problems 
with the temperature distribution in the outermost layers of 
the model photosphere leading to an underestimation of the 
low-level vibration-rotation lines of CO (carbon monoxide), 
and (3) problems with the data reduction. 



9 CONCLUSIONS AND FUTURE PROSPECTS 

Estimating the stellar atmospheric parameters from an ob- 
served spectrum with given error estimates entails a model 
selection task in which we had to select a synthetic spectrum 
from a collection of 125 models. Frequentist methods based 
on the Kolmogorov-Smirnov test and \ 2 statistics to assess 
the goodness-of-fit are unable to incorporate the so-called 
statistical and systematic measurement errors of the obser- 
vational data into the analysis. Our hierarchical Bayesian 
model with a normal model for the likelihood and conjugate 
normal prior is capable of taking both of these errors into 
account. Using the Bayesian weighted % 2 statistics to assess 
the goodness-of-fit, the results based on the 2.38-2.60 fim 
ISO-SWS data of a Boo are as follows: T e fi ranges between 
4160 and 4440 K, log g ranges between 1.20 and 1.65 dex 
and [Fe/H] ranges between —0.15 and —0.70 dex. For the 
model selection process, we have used the predictive squared 
error loss function. The parameters of the model with the 
best representation of the ISO-SWS data are T ef f = 4230 K 
±83 K), log g= 1.50 dex ±0.15 dex), and [Fe/H] = -0.30 dex 
±0.21 dex). 

Not only here but for a range of applications it is conve- 
nient to first rank the synthetic spectra in the grid, without 
including a and am- When including the observational er- 
rors, one then does not have to apply the Bayesian analysis 
to all models, like the 125 considered here, but only to a se- 
lection of models that are of interest, e.g., the models which 
have the highest ranks and perhaps a few other models which 
have a poor goodness-of-fit. 

It would be of interest, though outside of the scope of 
this paper, to apply the proposed meth od to (1) a la r ger se t 
of standard stellar candles analysed in iDecin et al.l l|2003h . 
(2) a 7-dimensional grid, in which not only the effective 
temperature, the gravity and the metallicity are variable, 



but also the carbon, nitrogen and oxygen abundance and 
the microturbulence, and (3) the synthesis analysis of high- 
resolution optical data. 

We emphasize that the hierarchical Bayesian model as 
proposed in this paper is a general method which is able 
to objectively determine the parameter ranges using the 
synthesis technique. In contrast to previous studies, this 
Bayesian method incorporates the systematic and statisti- 
cal measurement error in the analysis of the data, and so 
in the determination of the stellar parameters and their un- 
certainty intervals. A step-by-step algorithmic explanation 
of the Bayesian analysis developed in this paper and the 
source code thereof are available upon request. 
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